We present a search for neutral Higgs bosons $\phi$ decaying into $b\bar{b}$, produced in
association with b quarks in $p\bar{p}$ collisions. This process could be observable in
supersymmetric models with high values of $\tan\beta$. The event sample corresponds to
2.6 fb$^{-1}$ of integrated luminosity collected with the CDF II detector at the Fermilab
Tevatron collider. We search for an enhancement in the mass of the two leading jets in
events with three jets identified as coming from b quarks using a displaced vertex
algorithm. A data-driven procedure is used to estimate the dijet mass spectrum of the
non-resonant multijet background. The contributions of backgrounds and a possible Higgs
boson signal are determined by a two-dimensional fit of the data, using the dijet mass
together with an additional variable which is sensitive to the flavor composition of the
three tagged jets. We set mass-dependent limits on
$\sigma(p\bar{p} \to \phi b) \times B(\phi \to b\bar{b})$ which are applicable for a narrow
scalar particle $\phi$ produced in association with b quarks. We also set limits on
$\tan\beta$ in supersymmetric Higgs models including the effects of the Higgs boson width.
FIG. 1. Neutral scalar production in association with b quarks.




minimal supersymmetric standard model (MSSM) or extensions thereof. This occurs when
$\tan\beta$, the ratio of the Higgs boson vacuum expectation values for up-type and
down-type quarks, is large. For $\tan\beta \sim 40$ the cross section is expected to be a
few picobarns [1], giving a production rate which could be observable in $p\bar{p}$
collisions at $\sqrt{s} = 1.96$ TeV at the Fermilab Tevatron. In large $\tan\beta$
scenarios the pseudoscalar Higgs boson $A$ becomes degenerate with either the light ($h$)
or heavy ($H$) scalar, doubling the cross section.

In the standard model (SM), the inclusive event yield of a light Higgs boson in the
$b\bar{b}$ decay channel is overwhelmed by strong heavy-flavor pair production many orders
of magnitude larger. For this reason, searches for $H_{SM} \to b\bar{b}$ at the Tevatron
rely on associated production modes like $WH_{SM}$ and $ZH_{SM}$ where backgrounds are
restricted to those also containing a $W$ or $Z$. In this paper we report on a search for
$\phi \to b\bar{b}$, where $\phi$ represents a narrow scalar such as $H_{SM}$ or the MSSM
Higgs bosons $h/H/A$, with the associated production $b\phi$ likewise reducing the large
heavy flavor backgrounds. The production process is illustrated in Fig. 1. Results for the
$b\phi$ process in the case of Higgs boson decays to $b\bar{b}$ have been previously
obtained by D0 [2, 3], and for inclusive or b-associated Higgs boson production in the
$\tau\tau$ decay mode by CDF [4], D0 [5-7], and CMS [8].

We search for resonance decays into $b\bar{b}$ in events containing at least three b-jet
candidates identified by displaced vertices ("tagged" hereafter). As the jets resulting
from the resonance decay are usually the most energetic jets in the event, we study the
invariant mass of the two leading jets in $E_T$, denoted $m_{12}$. A signal would appear
as an enhancement in the $m_{12}$ spectrum. An example $m_{12}$ distribution is shown in
Fig. 2.
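As an illustration of the search variable, the following minimal Python sketch computes the invariant mass of the two jets with the highest $E_T$. It is not taken from the analysis software: the function names and the (E, px, py, pz) tuple layout are assumptions made here, and natural units (c = 1, energies in GeV) are assumed throughout.

```python
import math


def dijet_mass(jet_a, jet_b):
    """Invariant mass of two jets given as (E, px, py, pz) tuples in GeV."""
    e = jet_a[0] + jet_b[0]
    px = jet_a[1] + jet_b[1]
    py = jet_a[2] + jet_b[2]
    pz = jet_a[3] + jet_b[3]
    mass_squared = e * e - (px * px + py * py + pz * pz)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(mass_squared, 0.0))


def transverse_energy(jet):
    """E_T = E sin(theta), with sin(theta) = p_T / |p|."""
    e, px, py, pz = jet
    p = math.sqrt(px * px + py * py + pz * pz)
    return e * math.hypot(px, py) / p if p > 0.0 else 0.0


def leading_dijet_mass(jets):
    """m12: invariant mass of the two jets with the largest E_T."""
    ordered = sorted(jets, key=transverse_energy, reverse=True)
    return dijet_mass(ordered[0], ordered[1])
```

For two back-to-back massless jets of 75 GeV each plus a softer third jet, `leading_dijet_mass` returns 150 GeV, the mass of the parent resonance in this idealized case.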

The background is predominantly QCD multijet production containing multiple bottom or
charm quarks. Events with single pairs of heavy flavor also enter the sample when a third
jet from a light quark or gluon is mistakenly tagged. We do not have precise a priori
knowledge of the background composition and kinematics, nor do we wish to rely upon a
Monte Carlo generator to reproduce it well [9-11]. We have instead developed a technique
to model the $m_{12}$ spectrum for the background in the triple-tagged sample in a
data-driven manner, starting from double-tagged events.

To enhance the separation between the flavor-dependent background components and the
possible resonance signal, we introduce a second quantity $x_{tags}$, built from the
invariant masses of the secondary vertices reconstructed from the charged particle tracks
in each jet, which is sensitive to the flavor composition: three bottom quark jets vs. two
bottom quarks and one charm quark, etc. The kinematic information in $m_{12}$ is then
complemented by flavor information in $x_{tags}$.

FIG. 2. The reconstructed $m_{12}$ distribution for simulated events containing a
150 GeV/$c^2$ SM Higgs boson, for all events passing the selection criteria and for only
those where the two leading jets represent the b quarks from the Higgs boson decay
(70% of events for this mass). No backgrounds are included.



With data-driven estimates of the distributions of $m_{12}$ and $x_{tags}$ for the
backgrounds and Monte Carlo models for the neutral scalar signal, we perform maximum
likelihood fits of the two-dimensional distribution of $x_{tags}$ versus $m_{12}$ in the
data to test for the presence of resonances in the triple-tagged sample. These fits are
used to set limits on the cross section times branching ratio
$\sigma(p\bar{p} \to b\phi) \times B(\phi \to b\bar{b})$ and on $\tan\beta$ in MSSM
scenarios. Although the procedure has been optimized for the case of production of a
single resonance with the decay products predominantly represented by the two leading
jets in the event, the results can also be interpreted in models of new physics with
similar final states such as pair production of color octet scalars [12-14].



In Sec. II we briefly describe the CDF II detector subsystems upon which this analysis
relies. We discuss the data sample and event selection requirements in Sec. III. A
description of the signal simulation used for the search is found in Sec. IV. The
data-driven background model is presented in Sec. V. The systematic uncertainties on the
signal and background estimates are discussed in Sec. VI. The results for the standard
model and MSSM interpretations are shown in Sec. VII. In Sec. VIII we summarize and
conclude.
II. THE CDF II DETECTOR

The CDF II detector is an azimuthally and forward-backward symmetric apparatus designed
to study $p\bar{p}$ collisions at the Fermilab Tevatron. Details of its design and
performance are described elsewhere [15]; here we briefly discuss the detector components
which are relevant for this analysis. The event kinematics are described using a
cylindrical coordinate system in which $\phi$ is the azimuthal angle, $\theta$ is the
polar angle with respect to the proton beam, $r$ is the distance from the nominal beam
line, and positive $z$ corresponds to the proton beam direction, with the origin at the
center of the detector. The transverse $r$-$\phi$ (or $x$-$y$) plane is the plane
perpendicular to the $z$ axis. The pseudorapidity $\eta$ is defined as
$-\ln(\tan(\theta/2))$. The transverse momentum of a particle is defined as
$p_T = p \sin\theta$ and the transverse energy as $E_T = E \sin\theta$.
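The kinematic definitions above translate directly into code. The short Python sketch below is purely illustrative; the function names are our own and are not part of any CDF software.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta / 2)), with the polar angle theta in radians."""
    return -math.log(math.tan(theta / 2.0))


def transverse_momentum(p, theta):
    """p_T = p sin(theta)."""
    return p * math.sin(theta)


def transverse_energy(energy, theta):
    """E_T = E sin(theta)."""
    return energy * math.sin(theta)
```

A particle emitted perpendicular to the beam ($\theta = \pi/2$) has $\eta = 0$ and $p_T = p$, while particles closer to the beam line have larger $|\eta|$ and smaller $p_T$ for the same momentum.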

A magnetic spectrometer consisting of tracking devices inside a 3-m diameter, 5-m long
superconducting solenoidal magnet with an axial magnetic field of 1.4 T measures the
momenta and trajectories of charged particles. A set of silicon microstrip detectors
(L00, SVX, and ISL) [16] reconstructs charged particle trajectories in the radial range
1.5-28 cm, with a resolution on the particle position at its closest approach to the
beamline of 40 $\mu$m in the transverse plane (including a 30 $\mu$m contribution from
the size of the beam spot). A 3.1-m long open-cell drift chamber (COT) [17] occupies the
radial range 40-137 cm. Eight superlayers of drift cells with 12 sense wires each,
arranged in an alternating axial and ±2° pattern, provide up to 96 measurements of the
track position. Full radial coverage of the COT extends up to $|\eta| < 1$ and of the
silicon detectors up to $|\eta| < 2$.

A sampling calorimeter system arranged in a projective-tower geometry surrounds the
magnetic solenoid and covers the region up to $|\eta| < 3.6$. The calorimeter is
sectioned radially into lead-scintillator electromagnetic [18] and iron-scintillator
hadronic [19] compartments. The central part of the calorimeter ($|\eta| < 1.1$) is
segmented in towers spanning 0.1 in $\eta$ and 15° in $\phi$. The forward regions
($1.1 < |\eta| < 3.6$) are segmented in towers spanning 0.1 to 0.64 in $\eta$,
corresponding to a nearly constant 2.7° in $\theta$. The $\phi$ segmentation of the
forward regions is 7.5° for $1.1 < |\eta| < 2.11$ and 15° for $|\eta| > 2.11$.

Drift chambers located outside the central hadronic calorimeters and behind a 60 cm thick
iron shield detect muons with $|\eta| < 0.6$ [20]. Gas Cherenkov counters with a coverage
of $3.7 < |\eta| < 4.7$ measure the average number of inelastic $p\bar{p}$ collisions per
beam crossing and thereby determine the luminosity [21].

III. DATA SAMPLE AND EVENT SELECTION 

This analysis is based on a data sample corresponding to an integrated luminosity of
2.6 fb$^{-1}$ collected with the CDF II detector between February 2002 and July 2008. The
data are collected using a three-level trigger system. The first level requires two
towers in the central calorimeter with $E_T > 5$ GeV and two tracks with
$p_T > 2$ GeV/$c$ reconstructed in the COT. The second level requires two energy clusters
in the calorimeter with $E_T > 15$ GeV and $|\eta| < 1.5$ [22], along with two tracks
with $p_T > 2$ GeV/$c$ and impact parameter $|d_0| > 100$ $\mu$m, characteristic of heavy
flavor hadron decays, reconstructed using the level 2 silicon vertex trigger (SVT)
system [23]. The third level confirms the level 2 silicon tracks and calorimeter clusters
using a variant of the offline reconstruction. No matching is required between the tracks
in the silicon tracker and the calorimeter towers or clusters in the trigger system.

Due to the increasing Tevatron instantaneous luminos- 
ity profile, a higher-purity replacement for this trigger 
was implemented in July 2008 to stay within the con- 
straints imposed by the CDF data acquisition system. 
Because the analysis is so tightly coupled to the trigger 
requirements, analysis of the data collected after July 
2008 will require a separate dedicated study. 

The offline selection requires at least three jets with $E_T > 20$ GeV and detector
rapidity $|\eta| < 2$. The jets are reconstructed using a cone algorithm with radius
$\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2} < 0.7$, and are corrected for calorimeter
response and multiple interactions so that the energy scale reflects the total $p_T$ of
all particles within the jet cone. In addition, only jets containing at least two tracks
within a cone of $\Delta R = 0.4$ around the jet axis satisfying the quality requirements
of the displaced vertex-finding algorithm SECVTX [24] are considered. If more than three
jets in the event satisfy these requirements we consider up to the fourth leading jet in
the event selection requirements (see below). Additional jets satisfying the requirements
beyond the fourth leading jet are allowed but not used in the event selection. No veto is
applied for additional jets not satisfying these cuts, but they are ignored when we order
the jets by $E_T$ for the purpose of identifying the leading jets in the event. At least
two of the three or four jets which are used for the event selection must match the
positions of the calorimeter clusters found by the second and third levels of the trigger
in $\eta$ and $\phi$.
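The cone distance $\Delta R$ used for the jet radius and for the track-to-jet association can be sketched in Python as follows. This is a minimal illustration, not the CDF reconstruction code; the function names are assumptions, and the only subtlety shown is the wrap-around of the azimuthal difference into $[-\pi, \pi]$.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def in_cone(jet_eta, jet_phi, track_eta, track_phi, radius=0.4):
    """True if a track lies within a cone of the given radius around the jet axis."""
    return delta_r(jet_eta, jet_phi, track_eta, track_phi) < radius
```

Without the wrap-around, two directions at $\phi = 0.1$ and $\phi = 2\pi - 0.1$ would appear far apart although they are separated by only 0.2 rad.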

The signal sample for this search is defined by requiring that the two leading jets in
the event and either the third or fourth leading jet be tagged as b-jet candidates using
SECVTX. The two leading jets in the event must also match the displaced tracks required
by the level 2 trigger selection. The track matching allows for the case where both
tracks are matched to either of the two leading jets, or where each of the two leading
jets has one of the tracks matched. The matched track requirements bias the properties of
the displaced vertices found by the SECVTX algorithm. Restricting the track matching to
only the two leading jets simplifies the accounting of these biases at a cost of 20-25%
in efficiency relative to allowing the tracks to match any of the SECVTX-tagged jets in
the event.

We also select a superset of the triple-tagged signal region by requiring both of the two
leading jets, or at least one of the two leading jets and either the third or fourth
leading jet, to pass the SECVTX tag and level 2 track matching requirements. This
double-tagged sample is the starting point for the background estimation procedure
described in Sec. V.
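The relationship between the triple-tagged and double-tagged samples can be made concrete with a small Python sketch. It is a simplification under stated assumptions: each of the (up to four) leading jets carries a single boolean meaning "passes the tagging requirements", and the level 2 track-matching details are not modeled. The function and key names are hypothetical.

```python
def classify_event(tagged):
    """Classify an event from tag flags of its E_T-ordered leading jets.

    `tagged` is a sequence of booleans for the leading jets; only the first
    four are used, and missing jets are treated as untagged.
    """
    flags = list(tagged[:4])
    flags += [False] * (4 - len(flags))
    j1, j2, j3, j4 = flags
    extra_tag = j3 or j4  # a tag in the third or fourth leading jet
    return {
        "triple": j1 and j2 and extra_tag,
        "double_leading": j1 and j2,
        "double_mixed": (j1 or j2) and extra_tag,
    }
```

By construction every triple-tagged event also satisfies both double-tagged definitions, which is the sense in which the double-tagged sample is a superset of the signal region.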

We find 11 490 events passing the triple-tagged signal 
sample requirements. The double-tagged sample with 
both of the two leading jets tagged contains 267 833 
events, and the sample with at least one of the two lead- 
ing jets tagged and either the third or fourth jet tagged 
contains 424 565 events. 



FIG. 3. Cross section for $bg \to H_{SM} + b_{jet}$ in the standard model calculated with
MCFM, shown as a function of $m_{H,SM}$ (GeV/$c^2$).



IV. SIGNAL MODEL 



To compute the efficiency of this selection for neutral scalar signal events, the cross
section of the process being searched for must be precisely defined. We use the MCFM
program [25] to calculate the cross section for $bg \to H_{SM} + b_{jet}$ in the standard
model. From this baseline the Higgs boson production rates in supersymmetric models are
obtained by scaling the couplings [26, 27]. If there is a gluon in the final state along
with the outgoing b quark (MCFM does not simulate the Higgs boson decay) and they are
within $\Delta R < 0.4$ of each other, MCFM will combine them into a "$b_{jet}$";
otherwise the b quark alone serves as the jet. This $b_{jet}$ is the object upon which
the kinematic cuts can be applied.

We calculate the cross section for $H_{SM} + b_{jet}$ in the SM, requiring
$p_T > 15$ GeV/$c$ and $|\eta| < 2$ for the $b_{jet}$ to match the acceptance of the
SECVTX algorithm. We use CTEQ6.5M [28] parton distribution functions and set the
renormalization and factorization scales to $\mu_R = \mu_F = (2m_b + m_H)/4$ as suggested
in Refs. [29, 30]. The cross section obtained as a function of $m_{H,SM}$ is shown in
Fig. 3. Cross sections at the level of a femtobarn are not discernible in this final
state at the Tevatron, so in the SM this process is of little interest. In the MSSM,
however, simple tree-level scaling of the couplings and the degeneracy of the
pseudoscalar $A$ with one of the scalars $h/H$ enhances this cross section by a factor of
$2\tan^2\beta$. For $\tan\beta = 50$ we therefore expect cross sections of picobarns or
more at the Tevatron.
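The tree-level scaling quoted above is simple enough to check numerically. The Python sketch below applies the factor $2\tan^2\beta$ to an SM cross section; the function names are our own, and the 1 fb input is an illustrative placeholder for "the level of a femtobarn", not a value read from Fig. 3.

```python
def mssm_enhancement(tan_beta):
    """Tree-level enhancement factor 2 tan^2(beta) relative to the SM rate."""
    return 2.0 * tan_beta ** 2


def mssm_cross_section(sigma_sm, tan_beta):
    """Scale an SM cross section (any unit) by the tree-level MSSM factor."""
    return sigma_sm * mssm_enhancement(tan_beta)


# Illustrative input only: an SM cross section of 1 fb.
sigma_mssm_fb = mssm_cross_section(1.0, 50.0)  # 5000 fb, i.e. 5 pb
```

For $\tan\beta = 50$ the factor is 5000, so a femtobarn-level SM cross section becomes a few picobarns, consistent with the expectation stated in the text.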

The efficiency of the triple-tagged selection in events where the neutral scalar decays
into a $b\bar{b}$ pair is determined from simulated data generated using the PYTHIA [31]
Monte Carlo program and a full simulation of the CDF II detector [32]. We generate
associated production of narrow scalars (specifically, SM Higgs bosons) with additional b
quarks, and compare the kinematics of the events to the momentum and rapidity
distributions predicted by the MCFM calculation. We find that the associated b jets
(those not resulting from a Higgs boson decay) produced by PYTHIA are more central than
is predicted by MCFM, while the other event kinematics are in good agreement. We correct
the PYTHIA samples to match the MCFM predictions by reweighting the events based on the
pseudorapidity of the associated b jets. Further corrections are applied in order to
match the efficiencies of the SECVTX algorithm and level 2 silicon tracking requirements
to those measured in the CDF data [24].

The event selection efficiencies vary from 0.3% to 1.2% as a function of the mass of the
neutral scalar and are shown in Fig. 4. The efficiency of the offline requirement of
three or more jets is 14-28%, the efficiency after adding the requirement of three or
more SECVTX tags is 0.75-1.7%, and the final matching requirements of the tagged jets to
the trigger clusters and tracks reduce the efficiency to 0.3-1.2%. For a cross section of
10 pb we therefore expect to select 80-310 signal events passing our requirements. The
mass of the two leading jets in the event, $m_{12}$, which is used to separate signal
from background, is shown in Fig. 5 for four values of the neutral scalar mass.
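The quoted yield follows from the product of cross section, integrated luminosity, and selection efficiency. A minimal Python check, with a function name chosen here for illustration, is:

```python
def expected_events(sigma_pb, luminosity_fb_inv, efficiency):
    """Expected yield N = sigma * L * efficiency.

    sigma_pb is in picobarns and luminosity_fb_inv in inverse femtobarns;
    1 pb = 1000 fb, hence the conversion factor.
    """
    return sigma_pb * 1000.0 * luminosity_fb_inv * efficiency


low = expected_events(10.0, 2.6, 0.003)   # about 78 events
high = expected_events(10.0, 2.6, 0.012)  # about 312 events
```

With the 2.6 fb$^{-1}$ data sample and efficiencies of 0.3% and 1.2%, a 10 pb cross section gives roughly 78 and 312 events, which round to the 80-310 range stated above.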



V. BACKGROUND MODEL 

Aside from the possible neutral scalar signal, the triple-tagged event sample is
predominantly due to the QCD multijet production of heavy quarks. Other processes such as
$t\bar{t}$ production and $Z \to b\bar{b}$ + jets are found to be negligible, a point to
which we shall return in Section VII. The heavy flavor multijet events arise from a large
number of production mechanisms [9] for which the rates are not precisely known. The
differing kinematics of each can produce different $m_{12}$ spectra, which the background
estimation must accommodate. The $m_{12}$ spectrum of the background is also affected by
biases introduced by the trigger and displaced-vertex tagging requirements.

Heavy quark production can be categorized into three types of processes [9]: flavor
creation, flavor excitation, and gluon splitting. Flavor creation refers to cases where a
pair of heavy quarks are created directly from the hard scattering process, i.e.
$q\bar{q} \to b\bar{b} + X$ where the additional activity $X$ in the event is from
initial or final state gluon radiation. Flavor excitation refers to processes with a
heavy quark in the initial state which participates in the hard scattering, i.e.
$bq \to bq + X$. Cases where the heavy quarks are not directly involved in the hard
scattering are referred to as gluon splitting, i.e. $qg \to qg + X$ followed by
$g \to b\bar{b}$ where the heavy quark pair is produced as the gluon fragments. It is
possible to obtain more than two heavy quarks in the final state by combining these
processes in a single event, for example $cg \to cg + X$ followed by $g \to b\bar{b}$, or
$gg \to gg + X$ with both final state gluons splitting into $b\bar{b}$ pairs. Given the
large number of possible final states with multiple heavy quarks, each of which can be
obtained through a variety of production mechanisms, estimating the multijet background
by direct calculation is a complex undertaking with potentially large uncertainties. A
data-driven background estimation of the mixture of processes directly from the signal
sample itself is a more tractable problem, and the method that we adopt in this analysis.

FIG. 4. Selection efficiency for $b\phi$ events as a function of the neutral scalar mass
$m_\phi$.

FIG. 5. Distributions of $m_{12}$ for the simulated signal samples. The lines simply
connect the bins and do not represent parametrizations. All are normalized to unit area.

In order to qualitatively understand which of the many possible heavy quark final states
are necessary to model with our data-driven method, and to what extent they differ in
$m_{12}$, we begin with a study of simulated samples of generic QCD multijet production.
These samples are generated using the PYTHIA program with MSEL=1 ($2 \to 2$ scattering
where the outgoing partons can be gluons or quarks lighter than the top quark) and a
simple parametrization of the secondary vertex tagging efficiency which is a function of
the $E_T$, pseudorapidity, and quark flavor of the jets. We find in this study that more
than 90% of the QCD background in our selected triple-tag sample consists of events with
at least two b-jets, with the additional tagged jet being any of a mistagged light jet or
a correctly-tagged c-jet or third b-jet.

In three-jet events with at least two b-jets, the additional jet is also a b-jet roughly
2% of the time, a c-jet 4% of the time, and a light quark or gluon jet the remaining 94%
of the time. These fractions hold whether the two b-jets are the two leading jets or one
of them is the third-leading jet. The flavor composition of the additional jet will
ultimately be determined by fitting the data rather than using these estimates; however,
we will use them as starting points for the fit and also in the calculation of limits.

We next focus on the $m_{12}$ spectrum in the subset of the PYTHIA generator-level events
described above with at least two b-jets. We compare the spectrum in events with two
b-jets and at least one other jet of any flavor to those in events where the additional
jet(s) beyond the initial two b-jets has a particular flavor (charm or another bottom
jet). We find no significant differences between the flavor-inclusive spectrum and the
flavor-specific ones. These results hold when splitting the generated sample by heavy
quark production process, so the agreement is general rather than the result of a
cancellation or particular mix of processes. Changing the PYTHIA hard scattering $Q^2$
scale factor parameter PARP(67) over the range of 1-4 as in Ref. [5] produces significant
changes in the $m_{12}$ spectra; however, the agreement between the flavor-inclusive and
flavor-specific spectra is preserved as the changes in the underlying physics affect the
two spectra in a similar way. In order to use PYTHIA directly to estimate the $m_{12}$
spectrum of triple-tagged events we would need to know the "correct" values of PARP(67)
and other parameters, but the similarity in $m_{12}$ shape between double-tagged
(flavor-inclusive) and triple-tagged (flavor-specific) events appears to be insensitive
to the details of any particular PYTHIA tuning. All of the relevant jet physics is
therefore already contained in the double-tagged sample, which can be selected from data
to remove dependence on event generators such as PYTHIA. The only correction necessary to
use the double-tagged events as background estimates for the triple-tagged sample is the
purely instrumental bias of requiring the third tag.

Based on the results of the generator-level study, we conclude that $b\bar{b}$ plus a
third tagged jet of any flavor represents more than 90% of the heavy flavor multijet
background. This is the basis of our background model; the effect of neglecting the 10%
component with fewer than two b-jets is discussed in Sec. VII. Because the properties of
the additional jets in $b\bar{b}$ events do not depend strongly on the flavor of the
jets, we can use the sample of double-tagged events described in Sec. III as a
representation for all possible flavors of the third tags.

The efficiency of requiring the third tag does depend upon the flavor, so we construct
background estimates which depend on the flavor of the jet and its position in the
$E_T$-ordered list of jets in the event. Splitting the background estimates in this way
also provides flexibility to accommodate mixtures of production processes. For example,
events where the two leading jets are both b-jets are more likely to result from flavor
creation of $b\bar{b}$ than are events with the second- and third-leading jets both
b-jets, which have a larger contribution from a gluon splitting to $b\bar{b}$ and
recoiling against another parton from the hard scatter. The normalizations of these
flavor- and topology-dependent estimates will be determined from a fit to the data so as
to minimize dependence on theoretical inputs.

In the remainder of this section we show how we estimate the heavy quark multijet
background from the large sample of data events with two b-tags.



FIG. 6. Efficiency to tag a jet with the indicated flavor as a function of the jet $E_T$
(a) and the number of tracks passing SECVTX quality requirements in the jet (b). Only
jets with at least two quality tracks are included. The highest bin in each plot includes
overflows.

A. The Double-Tagged Sample



That the triple-tagged sample predominantly contains at least two b-jets is of major
importance. First, it reduces the number of flavor combinations which must be considered
to determine its composition to a manageable level. Secondly, samples of $b\bar{b}$
events with at least one additional jet are easily selected from the same dataset as the
signal region and are therefore subject to the same biases from the trigger and
displaced-vertex tagging of the two b-jets as the events in the signal region. By
simulating the effect of the SECVTX tag on the third jet, we can use the double-tagged
sample to model all components of the triple-tagged sample with two or more b-jets.
Because we are going to determine the normalizations from a fit to the data, we need only
to model the shape of the $m_{12}$ spectrum for each background component.

For moderate values of jet $E_T$, SECVTX becomes more efficient as jet $E_T$ increases,
particularly for light-flavor jets where the false tag rate is highly dependent upon the
number of candidate tracks in the jet, which scales as the jet $E_T$. For b and c quark
jets the effect is less dramatic, and does not hold over the full range of $E_T$. This
effect is illustrated in Fig. 6. The drop in efficiency for b quark jets at higher $E_T$
is due to increasing track occupancy in the jets, which causes the silicon tracker to
merge hits from different tracks, resulting in lower-quality tracks which are rejected by
the SECVTX requirements. Because of these variations of the efficiencies as a function of
jet $E_T$, requiring SECVTX tagged jets will bias the events to a different $m_{12}$
spectrum than is observed in the parent, untagged sample. The double-tagged sample which
is the starting point for our background estimates already includes the bias due to the
two existing tags, so we must simulate only the bias which would be due to requiring the
third tag as in the signal region. This is accomplished by weighting the events using
efficiency parametrizations for b, c, and light-flavor jets derived from large samples of
fully-simulated PYTHIA multijet events. The efficiencies are parametrized as a function
of the jet $E_T$ and the number of tracks in the jet passing the SECVTX quality cuts. As
these efficiencies are derived from simulated samples, they are corrected to match the
$E_T$-parametrized efficiencies observed in the data using the same procedure employed
for the simulated Higgs boson samples.
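The weighting step described above can be sketched as a weighted histogram: each double-tagged event enters the $m_{12}$ template with a weight equal to the probability of tagging its untagged jet. The Python sketch below is a schematic illustration only. The event layout, the function names, and in particular the numbers inside `toy_efficiency` are invented for the example and do not represent the SECVTX parametrizations used in the analysis.

```python
import bisect


def third_tag_template(events, efficiency, bin_edges):
    """Build a unit-area m12 template from double-tagged events.

    events     : iterable of (m12, jet_et, n_tracks) for the jet whose tag
                 is being simulated
    efficiency : callable (jet_et, n_tracks) -> tag probability
    bin_edges  : increasing list of m12 bin boundaries
    """
    counts = [0.0] * (len(bin_edges) - 1)
    for m12, jet_et, n_tracks in events:
        index = bisect.bisect_right(bin_edges, m12) - 1
        if 0 <= index < len(counts):
            counts[index] += efficiency(jet_et, n_tracks)
    total = sum(counts)
    if total > 0.0:
        # Only the shape matters: normalizations are left to the fit.
        counts = [c / total for c in counts]
    return counts


def toy_efficiency(jet_et, n_tracks):
    """Made-up efficiency rising with E_T and track count (illustration only)."""
    return 0.005 * n_tracks * min(jet_et, 100.0) / 100.0
```

Because the weight grows with jet $E_T$ and track multiplicity in this toy parametrization, events with harder jets contribute more, which is how the weighting shifts the template toward higher $m_{12}$.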

We describe the flavor structure of the jets in the event in the form xyz, where xy
denotes the flavor of the two leading jets and z is the flavor of the third-leading jet
or fourth-leading jet in the case that the third-leading jet is not tagged by SECVTX. For
example, bqb would denote events where the two leading jets are a b-jet (b) and a
mistagged light quark (or gluon) jet (q), and the third tag is another b-jet. Because our
search variable $m_{12}$ is symmetric under the interchange of the two leading jets, we
make no distinction between the leading and second-leading jets so that in a bqb event
the gluon or light-flavor jet q could be either of the two leading jets.

With this convention, we identify five types of event with at least two b-jets. Three
involve b-jets in both of the leading jets: bbb, bbc, and bbq. The other two, bcb and
bqb, have the non-b-jet in one of the two leading jets. The distinction between the
flavor content within the two leading jets and the flavor of the third jet is important,
as the events will have differing kinematics and tagging biases when comparing bbq vs.
bqb. In bbq events the two b-jets are likely to have originated directly from the hard
scatter, while in bqb it is more likely that the two b-jets come from a gluon splitting
as mentioned above. The SECVTX algorithm is much more biased toward high-$E_T$ jets for
light flavor than it is for b-jets, so we expect that bqb events will have a harder
$m_{12}$ spectrum than bbq. Because we do not want to make any assumption about the rate
of gluon splitting relative to $b\bar{b}$ flavor creation, we use both estimates and
allow the fit of the data to determine the relative proportions.
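The xyz labeling convention, including the symmetry under exchange of the two leading jets, can be expressed compactly in Python. The sketch is illustrative; the function names are hypothetical, and flavors are encoded as the single characters b, c, and q used in the text.

```python
# The five background components with at least two b-jets named in the text.
MODELED_COMPONENTS = ("bbb", "bbc", "bbq", "bcb", "bqb")


def flavor_label(leading, second, third_tag):
    """Return the xyz label, ignoring the order of the two leading jets.

    Sorting the first two characters implements the m12 symmetry:
    ('q', 'b', 'b') and ('b', 'q', 'b') both give 'bqb'.
    """
    pair = "".join(sorted((leading, second)))
    return pair + third_tag


def is_modeled(label):
    """True if the label is one of the five modeled components."""
    return label in MODELED_COMPONENTS
```

Labels such as cqb, with fewer than two b-jets, fall outside the five components and belong to the roughly 10% of the background that the model neglects.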



1. Corrections to the Double-Tagged Sample 

While our model assumes two b-jets in each event, the generator-level study described
above indicates that the double-tagged events have a contribution of ~10% where one or
both of the tagged jets is a "mis-tagged" light flavor jet. We correct for this using
events which have two displaced vertices, but where one or both of the vertices are on
the opposite side of the primary vertex from the jet direction. These "negative" tags are
predominantly fake tags from light-flavor jets and are a product of the finite position
resolution of the tracking system. We expect there to be an equal number of fake tags
from this source on the default, "positive" side, together with additional contributions
of fake tags from $K_S/\Lambda$ and interactions with the detector material which are not
present in the negative tags. The negative tags also contain a small contribution from
heavy-flavor jets which should be subtracted in order to obtain the positive fake rate.
The total number of positive fake tags is found by scaling the negative tag count by a
factor $\lambda = 1.4 \pm 0.2$ [24] which accounts for the effects described above and is
measured from the data. We find no significant variation of $\lambda$ as a function of
jet $E_T$.

We weight these events to simulate the third tag in the same way as the events with two
default "positive" tags and then compute the number of true $b\bar{b}$ events using

$$N_{b\bar{b}} = N_{++} - \lambda N_{+-} + \lambda^2 N_{--} \tag{1}$$



FIG. 7. Distributions of $m_{12}$ used to construct the corrected bcb background
estimate. The +c+ shape (the initial estimate with two positive tags, before correction)
is shown with unit area. The +c-/-c+ shapes (starting from one positive and one negative
tag) and -c- shapes (two negative tags) are shown with normalizations proportional to
their area compared with +c+. For -c- a further scaling by a factor of ten is applied to
enhance visibility. The corrected estimate is reduced in area by ~10% relative to +c+.
where $N_{++}$ is the number of observed positive double-tags, $N_{+-}$ is the number of
events with one of the tags negative, and $N_{--}$ is the number with both tags negative.
This relation can be understood by considering $\lambda N_{+-}$ as the number of events
with either one b-tag and one fake tag or two fake tags. The two fake tag case will be
double-counted by this estimate, because there are two permutations for which jet is the
positive tag and which is the negative tag. Therefore the $\lambda^2 N_{--}$ term, which
is an estimate of the number of two fake tag events, is added to correct for the
double-counting. The $\lambda$ factors are inserted to convert the negative tag rates
into estimates of the total positive fake tag rates.

This correction to subtract the non-$b\bar{b}$ component is applied bin-by-bin in
$m_{12}$ when constructing estimates for all five of the background components. It
reduces the normalization by around 10% and also softens the $m_{12}$ spectrum in each
estimate, because the samples with one or two negative tags will have harder $m_{12}$
spectra than the sample with two positive tags due to the fake tag bias towards higher
jet $E_T$ described above. The effect of the correction is illustrated in Fig. 7 for the
bcb background estimate.
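A bin-by-bin application of Eq. (1) can be sketched in Python as below. The function name and the histogram contents in the usage line are illustrative assumptions; only the formula and the central value $\lambda = 1.4$ come from the text, and the uncertainty on $\lambda$ is not propagated here.

```python
def corrected_bb_counts(n_pp, n_pm, n_mm, lam=1.4):
    """Apply N_bb = N_++ - lam * N_+- + lam^2 * N_-- in each m12 bin.

    n_pp, n_pm, n_mm are equal-length lists of weighted bin contents for
    events with two positive tags, one positive and one negative tag, and
    two negative tags, respectively.
    """
    if not (len(n_pp) == len(n_pm) == len(n_mm)):
        raise ValueError("all histograms must have the same binning")
    return [
        pp - lam * pm + lam * lam * mm
        for pp, pm, mm in zip(n_pp, n_pm, n_mm)
    ]


# Made-up bin contents, for illustration only.
corrected = corrected_bb_counts([100.0, 50.0], [10.0, 8.0], [1.0, 1.0])
```

In this toy input the second bin loses a larger fraction of its content than the first, mimicking how a harder negative-tag spectrum softens the corrected template.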



B. The Heavy Flavor Multijet Background 
Components 

We now describe in detail how each of the five model components, or "templates", for the
three-tag backgrounds is constructed from the double-tag data. When referring to the
templates we adopt the convention of capitalizing the assumed flavor of the untagged jet,
so that for the bbq background we would denote the template as bbQ. This distinction is
most important for the bbb background as will be seen later.



1. The bbc and bbq backgrounds 

Starting from the corrected double-tagged sample with 
the two leading jets tagged, we weight the events by the 
probability to tag the third jet if it were a c-jet or a 
light-quark jet to produce estimates for the bbc and bbq 
background components, respectively. If a fourth jet ex- 
ists we add the weights to tag either the third or fourth 
jet. 
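The weighting step can be sketched as follows. This is a minimal illustration, not the analysis code: the event layout, the function names, and the placeholder tag-probability parametrization are all hypothetical, and only the rule of summing the weights for the third and fourth jets is taken from the text.

```python
def third_tag_weight(extra_jets, tag_prob):
    """Weight for a double-tagged event to acquire a third tag.

    extra_jets: the third jet and, if present, the fourth jet
    tag_prob:   callable giving the probability to tag a jet under the
                assumed flavor (c-jet or light-quark jet)
    """
    # Following the text, the weights to tag either the third or the
    # fourth jet are added.
    return sum(tag_prob(jet) for jet in extra_jets[:2])


def charm_tag_prob(jet):
    # Placeholder parametrization, NOT a measured CDF efficiency.
    return min(0.10, 0.001 * jet["et"])


# Hypothetical event: the two leading jets are tagged, two extra jets follow.
event = {"m12": 120.0, "extra_jets": [{"et": 40.0}, {"et": 25.0}]}
weight = third_tag_weight(event["extra_jets"], charm_tag_prob)
print(weight)
```

Filling a histogram of the dijet mass with these weights, once per assumed flavor, would give the bbC and bbQ estimates.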



2. The bcb and bqb backgrounds 

The templates for these backgrounds are constructed 
in essentially the same way as bbC and bbQ. The differ- 
ence is that we start from a double-tagged sample where 
one of the tags is in the third or fourth jet rather than 
requiring that both of the two leading jets be tagged as in 
bbQ/bbC. From there we subtract the non-$b\bar{b}$ component
using Eq. 1 and weight the untagged jet within the two
leading jets with either the charm-tag efficiency or the 
light-flavor jet mistag probability. The event selection 
requires that there be at least two level 2 trigger silicon 
tracks matched to the two leading jets, so for example 
in the bbQ template we require that the two leading jets 
contain at least two matched level 2 tracks (either at least 
one in each jet or at least two in one of the jets). For the 
bCb and bQb templates we require that only one of the 
two leading jets be tagged, and simulate the tag in the 
other jet. If the tagged jet has fewer than two matched 
level 2 silicon tracks, we use an efficiency parametriza- 
tion for the other of the two leading jets that represents 
not only the efficiency to tag the jet with SECVTX (as 
is used in the bbQ case, for example, to simulate the fake 
light-flavor tag of the third or fourth jet) but also for 
that jet to contain enough matched level 2 silicon tracks 
so that the total for the two leading jets is at least two. 
So, for example, if the leading jet is tagged and has one 
matched level 2 silicon track, we would weight the event 
by the combined efficiency to not only tag the second- 
leading jet with SECVTX but also to have matched at 
least one level 2 silicon track to it. In this way the effect 
of requiring at least two matched level 2 silicon tracks 
within the two leading jets is modeled. In the example 
with one matched track in the leading jet, there must be 
a second level 2 silicon track somewhere in the event for it 
to have passed the online trigger selection. We account 
for this by requiring that between the two tagged jets 
(the leading jet and either the third- or fourth-leading in 
our example) there must be at least two matched level 2 
silicon tracks. The requirement of the matched track in the third- or
fourth-leading jet is not present in the signal sample, so this
represents an unwanted bias. We remove the bias by additionally
weighting these events by the ratio of the inclusive SECVTX $b$-jet tag
efficiency for the third or fourth jet to the efficiency for SECVTX
tagging with matched level 2 tracks.



3. The bbb background 

The third-tag weighting procedure works straightforwardly for the bbc
and bbq backgrounds, because the $b$-quark production physics is the
same as in the bbj events used as the starting point: the $b$-jets in
the double-tagged sample can be mapped directly to the signal region
and the various $b\bar{b}$ production mechanisms are properly
represented. For the bbb background this is not the case, because there
are two $b\bar{b}$ pairs present. Sometimes the two leading jets in the
event are from the same $b\bar{b}$ pair, in which case a bbB template
would be the appropriate choice because it is derived from events with
the $b\bar{b}$ pair in the two leading jets. Other times the two
leading jets are from different $b\bar{b}$ pairs, where a bBb template
would be a better representation.

The two methods of constructing a template for bbb have significantly
different $m_{12}$ distributions, which is due to the particular
kinematics of $b\bar{b}$ production through gluon splitting. Gluon
splitting produces $b\bar{b}$ pairs which tend to be less back-to-back
than those from other production mechanisms. When the two $b$-jets in
such an event are the two leading jets, as in the bbB template, we
observe a softer $m_{12}$ distribution than is seen in bBb, where only
one jet from the $b\bar{b}$ pair is within the two leading jets and the
other of the two leading jets is an additional jet in the event against
which the $b\bar{b}$ system is recoiling.

The PYTHIA simulation indicates that the $m_{12}$ spectrum for bbb
events lies between the two estimates bbB and bBb, as shown in Fig. 8.
The difference between the two estimates is largest for events
involving only gluon splitting, but the relationship also persists
across other heavy-flavor production mechanisms. We conclude that
regardless of the relative rates of $b\bar{b}$ production processes,
the bbb background can be derived from an interpolation between the two
templates bbB and bBb. We include both in the fit and let the data
determine the proper weighting.

4. Backgrounds summary

The full set of background fit templates for $m_{12}$ is shown in
Fig. 9. Because they are too similar to discriminate in the fit, we use
an average of the bbC and bbQ templates, which we denote bbX. The
backgrounds with two heavy flavor jets in the leading jet pair have
similar $m_{12}$ distributions. Because the false tag rate rises with
jet $E_T$ more rapidly than does the $b$-tag or $c$-tag rate, the bQb
template displays a harder spectrum than bCb or bBb even though they
are derived from the same events.






FIG. 8. Distributions of $m_{12}$ from the generator-level PYTHIA
study, for simulations of the double-tagged background templates bBb
and bbB, compared to events with three true $b$-jets.



FIG. 10. The tag mass $m_{\mathrm{tag}}$ for different jet flavors,
from the CDF simulation. All distributions are normalized to unit area.
FIG. 9. Distributions of $m_{12}$ for the background fit templates.
The lines simply connect the bin centers and do not represent
parametrizations. The bBb template is obscured by bCb because they
have nearly the same $m_{12}$ distribution. All are normalized to unit
area.
C. Fitting the Model to the Data

Our search will examine the $m_{12}$ distribution for an enhancement
riding atop the continuum background. The search will be done using a
simultaneous fit for the normalization of six distributions: one
neutral scalar model of varying mass, and the five background templates
that together will model the background. A fit in $m_{12}$ alone is
challenged by the fact that the background templates peak at similar
mass but have different widths, as seen in Fig. 9. A possible signal
riding on the falling edge of the background above 100 GeV/$c^2$ could
therefore be fitted by adding an additional contribution from the wide
bQb distribution, for example, resulting in loss of sensitivity. This
effect can be mitigated by adding another variable which is sensitive
to the differing flavor content of the templates, which we call
$x_{\mathrm{tags}}$. In this section we describe the
$x_{\mathrm{tags}}$ variable and then examine the ability of our
background model alone to describe the data without any contribution
from the neutral scalar signal model, using a two-dimensional fit of
the distributions of $m_{12}$ and $x_{\mathrm{tags}}$.



1. The Flavor Dependent Variable $x_{\mathrm{tags}}$

Because we are going to fit the $m_{12}$ spectrum of the triple-tagged
data with our background templates, each of which has its own
characteristic $m_{12}$ spectrum, it is useful to have a second method
with which to constrain the relative fractions of each background
template and obtain a firmer prediction of the overall background
$m_{12}$ spectrum. The $x_{\mathrm{tags}}$ variable should be sensitive
to the flavor of the tagged jets using information independent of
$m_{12}$.

The observable chosen as the basis of $x_{\mathrm{tags}}$ is
$m_{\mathrm{tag}}$, the invariant mass of the tracks which constitute
the secondary vertex as determined by SECVTX. This reflects the masses
of the underlying heavy flavor hadrons and is sensitive to the flavor
of the jet, as shown in Fig. 10. We define the quantity
$x_{\mathrm{tags}}(m_{1,\mathrm{tag}}, m_{2,\mathrm{tag}}, m_{3,\mathrm{tag}})$,
where $m_{i,\mathrm{tag}}$ is the mass of the tracks forming the
displaced vertex in jet $i$ = 1, 2, or 3.

For example, as mentioned above we expect that bqb events will exhibit
a harder spectrum than bbq due to the bias from the fake tag in one of
the two leading jets. The $x_{\mathrm{tags}}$ variable should therefore
be constructed so that it is sensitive to the presence of a charm or
fake tag in one of the two leading jets, using $m_{1,\mathrm{tag}}$ and
$m_{2,\mathrm{tag}}$. If these events were removed, we would be left
with backgrounds where the two leading jets are both $b$-jets and the
third leading jet is any flavor. The case where the third jet is also a
$b$-jet constitutes an irreducible background to the potential neutral
scalar signal in the $x_{\mathrm{tags}}$ spectrum,
FIG. 11. Illustration of the $x_{\mathrm{tags}}$ definition. All axes
are in GeV/$c^2$.



because the signal is also three $b$-jets. However, the backgrounds
where the third jet is a charm or fake tag (bbc and bbq) can be
separated from the bbb cases using $m_{3,\mathrm{tag}}$.

Because we make no distinction between the two leading jets in our
flavor classification scheme, we construct $x_{\mathrm{tags}}$ to be
symmetric under their interchange, as is $m_{12}$. We are interested
only in the flavor combination of the pair. We choose a simple sum
$m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}}$ to satisfy this constraint.
Combined with the information from $m_{3,\mathrm{tag}}$ we have a
two-dimensional distribution; however, we want to reduce this to a
single variable so that when combined with $m_{12}$ we are left with
two-dimensional fit templates. To this end we define the
$x_{\mathrm{tags}}$ variable as



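$$
x_{\mathrm{tags}} =
\begin{cases}
\min(m_{3,\mathrm{tag}}, 3) & : \; m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}} < 2 \\
\min(m_{3,\mathrm{tag}}, 3) + 3 & : \; 2 \le m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}} < 4 \\
\min(m_{3,\mathrm{tag}}, 3) + 6 & : \; m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}} \ge 4
\end{cases}
\qquad (2)
$$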

where $\min(a, b)$ returns the minimum of $a$ and $b$, and all
quantities are in units of GeV/$c^2$. The net effect is to unstack a
two-dimensional histogram of $m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}}$
versus $m_{3,\mathrm{tag}}$ into the one-dimensional variable
$x_{\mathrm{tags}}$, as illustrated in Fig. 11. The
$m_{1,\mathrm{tag}} + m_{2,\mathrm{tag}}$ axis provides the sensitivity
to bcb and bqb versus the other backgrounds, and $m_{3,\mathrm{tag}}$
separates out bbc and bbq.
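The piecewise definition of Eq. 2 can be written as a short function. This is a minimal sketch: the function name is ours, and assigning the boundary values 2 and 4 to the upper group is an assumption about how the edges are handled.

```python
def x_tags(m1_tag, m2_tag, m3_tag):
    """Map three tag masses (GeV/c^2) onto the one-dimensional x_tags."""
    pair_sum = m1_tag + m2_tag        # symmetric in the two leading jets
    third = min(m3_tag, 3.0)          # third-tag mass, capped at 3
    if pair_sum < 2.0:
        offset = 0.0                  # low-mass pair
    elif pair_sum < 4.0:
        offset = 3.0                  # intermediate-mass pair
    else:
        offset = 6.0                  # high-mass pair
    return third + offset


print(x_tags(0.5, 0.8, 1.2))   # low pair sum: value lies in [0, 3]
print(x_tags(1.5, 1.0, 2.0))   # intermediate pair sum: value lies in [3, 6]
print(x_tags(2.5, 2.0, 4.5))   # high pair sum, third tag mass capped at 3
```

Swapping the first two arguments leaves the result unchanged, which is the symmetry requirement stated in the text.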

In order to compute $x_{\mathrm{tags}}$ for the background templates we
need to simulate not only the efficiency of the third tag for each
event, but also its expected $m_{\mathrm{tag}}$ spectrum. This is done
by extending the tag efficiency parametrization so that it is a
function of the jet $E_T$, the number of quality tracks, and
$m_{\mathrm{tag}}$. The parametrization can then be considered to
represent the probability to tag a jet with a given $E_T$ and number of
quality tracks and an assumed flavor of $q$, $c$, or $b$, and for that
tag to have a particular tag mass $m_{\mathrm{tag}}$. Projections of
this parametrization onto the $m_{\mathrm{tag}}$ axis for particular
values of $E_T$ and number of tracks are shown in Fig. 12. For each
event we iterate over all bins of $m_{\mathrm{tag}}$ in the
parametrization for the simulated third tag, compute the corresponding
$x_{\mathrm{tags}}$ for each bin, and build up the background templates
using the parametrization to estimate the appropriate weight for each
value of $m_{\mathrm{tag}}$ in the third tag. Each event will
contribute to only a single bin in $m_{12}$ but can fill multiple bins
in $x_{\mathrm{tags}}$ as we iterate over the bins of
$m_{\mathrm{tag}}$ for the third tag.
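The template-filling loop might look like the sketch below. The binning is an assumption made for illustration (nine unit-width bins of the flavor variable, with third-tag masses of 3 GeV/c^2 or more placed in the last bin of their group), and the per-bin tag probabilities are hypothetical numbers, not the measured parametrization.

```python
N_X_BINS = 9  # assumed: unit-width bins covering 0 to 9 GeV/c^2


def x_tags_bin(pair_sum, m3_tag):
    """Bin index for a given leading-pair tag-mass sum and third-tag mass."""
    group = 0 if pair_sum < 2.0 else (1 if pair_sum < 4.0 else 2)
    third = min(int(m3_tag), 2)  # masses >= 3 share the last bin of the group
    return 3 * group + third


def fill_template(template, m12_bin, pair_sum, mtag_centers, mtag_probs):
    """Add one double-tagged event to a 2D template (list of rows)."""
    # One m12 bin per event, but several bins along the flavor axis:
    # each m_tag bin of the simulated third tag adds its own probability.
    for m3, prob in zip(mtag_centers, mtag_probs):
        template[m12_bin][x_tags_bin(pair_sum, m3)] += prob


template = [[0.0] * N_X_BINS for _ in range(4)]   # 4 toy m12 bins
centers = [0.5, 1.5, 2.5, 3.5]                    # m_tag bin centers
probs = [0.010, 0.020, 0.008, 0.002]              # hypothetical probabilities
fill_template(template, 1, 2.7, centers, probs)
print(template[1])
```

The sum of the per-bin probabilities equals the total weight the event would receive from the efficiency-only parametrization, so the projection onto the dijet mass axis is unchanged.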



FIG. 12. Projections onto the $m_{\mathrm{tag}}$ axis of the tag
probability parametrization, for jets with 80 < $E_T$ < 100 GeV and the
indicated numbers of quality SECVTX tracks, for $b$ jets (a) and fake
tags of light-flavor jets (b). The area of each histogram indicates the
total tag probability for this slice of $E_T$ and number of tracks.
Distributions of $x_{\mathrm{tags}}$ for all of the background
components are shown in Fig. 13. The backgrounds separate into three
groups in this variable, with bCb and bQb more prominent in the bins of
lower $x_{\mathrm{tags}}$, bbB and bBb more prominent in the higher
$x_{\mathrm{tags}}$ bins, and bbX with a different shape due to the
non-$b$ flavor of the tag in the third leading jet in those events. A
neutral scalar signal, because it contains three $b$-jets, would look
very similar to the bbB and bBb backgrounds in $x_{\mathrm{tags}}$.



2. Background Normalization Predictions 

Our background model requires information only on 
the shapes of the various templates, with the normaliza- 
tions determined from a fit to the data. However, it is 
possible to obtain a priori estimates of the normaliza- 
tions, which can be used as starting points for the fit, 
using our templates and inputs from the generator-level 
pythia study discussed at the beginning of this Section. 
As constructed, the templates have total area equal to 
the number of bb+jet events, multiplied by the average 
FIG. 13. Distributions of $x_{\mathrm{tags}}$ for the background fit
templates. All are normalized to unit area.
[Plot: curves for bbB, bBb, bbX, bCb, and bQb versus $x_{\mathrm{tags}}$ (GeV/$c^2$), from 0 to 9.]

efficiency to tag the additional jet over the entire bb+jet sample as
if it were always a light-flavor, charm, or bottom jet depending on the
assumed flavor. All that remains in each case is to multiply by the
fraction of events where the jet truly has the assumed flavor. For the
charm and bottom cases, the PYTHIA study indicates fractions of 4% and
2%, respectively.

For the light-flavor cases bbQ and bQb, we use the observed numbers of
events with one or more negative tags to estimate the normalizations of
these components by extending the calculation in Eq. 1 to the case of
three tags:

$$N_{bbQ} = N_{++-} - \lambda N_{+--} + \lambda^2 N_{---} \qquad (3)$$
$$N_{bQb} = N_{+-+} - \lambda (N_{+--} + N_{--+}) + \lambda^2 N_{---} \qquad (4)$$

where the $N$ are the numbers of observed events with the indicated
positive/negative tag patterns. In the case of the two leading jets
containing a positive and a negative tag, for example $N_{+-+}$, the
negative tag can be in either the first or second leading jet. The
factor $\lambda$ is the same fake tag asymmetry factor used in Eq. 1.

We emphasize that these estimates are never used as constraints in any
fits; the normalizations of the background components are always
derived strictly from the data sample itself without any theoretical
input on jet flavor fractions. We will, however, use these a priori
estimates as starting points in Sec. VII for estimating the sensitivity
of our search.



3. Background-Only Fit to the Data

We fit the background and signal templates to the data using a binned
maximum-likelihood fit. The likelihood function is a joint probability
of the Poisson likelihood for each bin,
$\mu_{ij}^{n_{ij}} e^{-\mu_{ij}} / n_{ij}!$, where $n_{ij}$ is the number of



TABLE I. Numbers of fitted events for each background type, compared to
the estimates derived from the PYTHIA heavy flavor fractions.

component          estimate             $N_{\mathrm{fit}}$
bbB                1300                 1520±540
bBb                2950                 2620±550
bbX = bbQ + bbC    1350+640 = 1990      2210±160
bCb                1380                 1710±630
bQb                3480                 3430±390



observed events in the $i$-th bin of $m_{12}$ and the $j$-th bin of
$x_{\mathrm{tags}}$, and the expectation in that bin $\mu_{ij}$ is
given by

$$\mu_{ij} = \sum_b N_b f_{b,ij} + N_s f_{s,ij} \qquad (5)$$

where $b$ represents the five background templates, $f_{b,ij}$ and
$f_{s,ij}$ are the bin contents of the various backgrounds ($f_b$) and
of the neutral scalar signal ($f_s$), and the five $N_b$ and optionally
$N_s$ are the free parameters of the fit which represent the
normalizations of each component. We normalize all background and
signal templates to unit area when performing this fit, so that the
$N_b$ and $N_s$ parameters will correspond to the numbers of events in
the sample assigned to each template.
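A minimal sketch of the binned Poisson likelihood described here is given below, assuming NumPy is available. The function name, the toy templates, and the toy data are hypothetical; in a real fit the returned quantity would be passed to a numerical minimizer, which is not shown.

```python
import numpy as np
from math import lgamma


def neg_log_likelihood(norms, templates, data):
    """Negative log of the joint Poisson likelihood over all 2D bins.

    norms:     K normalizations (event counts per component)
    templates: array (K, n_m12, n_x), each template normalized to unit area
    data:      array (n_m12, n_x) of observed counts
    """
    # Expected content of each bin: sum over components of N_k * f_k,ij.
    mu = np.tensordot(np.asarray(norms, dtype=float), templates, axes=1)
    mu = np.clip(mu, 1e-12, None)  # guard against log(0) in empty bins
    log_factorial = np.vectorize(lgamma)(data + 1.0)
    return float(np.sum(mu - data * np.log(mu) + log_factorial))


# Two toy unit-area templates on a 2 x 3 grid (illustration only).
flat = np.full((2, 3), 1.0 / 6.0)
peaked = np.array([[0.3, 0.2, 0.1], [0.2, 0.1, 0.1]])
templates = np.stack([flat, peaked])
# Toy "data" set equal to the expectation for normalizations (60, 40).
data = 60.0 * flat + 40.0 * peaked
nll_true = neg_log_likelihood([60.0, 40.0], templates, data)
nll_off = neg_log_likelihood([70.0, 30.0], templates, data)
print(nll_true < nll_off)
```

Because the templates are normalized to unit area, the fitted normalizations can be read directly as event counts, as in Table I.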

Fig. 14 shows the result of a fit of the 11 490 triple-tagged events
observed in the data using only the background templates ($N_s$ fixed
to zero) and with no systematic errors. Only the projections onto each
axis are shown for clarity. The post-fit $\chi^2$/dof between the
observed data and best-fit background is 185.8/163 = 1.140. The numbers
of fitted events for each background type are given in Table I and
compared to the predictions derived from the PYTHIA jet flavor
fractions. Good agreement is observed for all background components.
This comparison does not demonstrate the ability of PYTHIA to correctly
model the $m_{12}$ spectrum observed in the data; it tests only the
overall numbers of events predicted for each flavor composition, not
their kinematics. The good agreement between the fitted number of bQb
events and the data-driven prediction of the normalization does
indicate that we are not missing any sizeable background component with
fewer than two $b$-jets, because that component would be expected to
show up at higher $m_{12}$ values due to the bias towards high-$E_T$
jets produced by fake tags.

In order to fully judge the quality of the background- 
only fit and whether it adequately describes the data, we 
require a framework that allows for the introduction of 
systematic uncertainties. We also need to be able to cal- 
culate the significance of any possible signal contribution 
after accounting for systematic uncertainties. The proce- 
dure we adopt uses ensembles of simulated experiments, 
where the simulated experiments include the effects of 
systematic uncertainties and the fitting procedure is the 
FIG. 14. Fit of the triple-tagged data sample using only the QCD
background templates, in the $m_{12}$ (a) and $x_{\mathrm{tags}}$ (b)
projections. The differences between the data and the fit model are
shown in the lower section of each figure.
same as described above. We describe the systematic uncertainties which
we consider in the next Section and the simulated experiments procedure
in Sec. VII.



VI. SYSTEMATIC UNCERTAINTIES 



Several sources of systematic uncertainty on the signal and background
contributions are considered. A summary is shown in Table II. Modeling
uncertainties can affect both the normalization of the fit templates
(denoted 'rate' in the Table) and the distributions of $m_{12}$ and
$x_{\mathrm{tags}}$ (denoted 'shape'). Shape uncertainties are
introduced by modifying the templates using an interpolation procedure
[33].

Rate uncertainties on the signal contribution relate to 
the number of signal events expected for a given cross 
section. They include the integrated luminosity of the 
data sample, the statistical errors due to the finite size of 
the simulated signal samples, the efficiency of the trigger 
and SECVTX tagging requirements, and the effect on 
the efficiency due to uncertainties on parton distribution 
functions (PDFs). For the PDF uncertainty we apply the
twenty eigenvector variations of the CTEQ 6.5M set. 

Modeling of the energy scale of jets introduces uncer- 
tainties both on the acceptance for signal events to pass 
the event selection and on the $m_{12}$ spectrum of these
events. No energy scale modeling uncertainty is assigned 
to the background templates since they are derived from 
the data. 

The $x_{\mathrm{tags}}$ variable introduces an uncertainty due to
modeling of the $m_{\mathrm{tag}}$ spectrum of the SECVTX displaced
vertices. This uncertainty affects only the shape of the
$x_{\mathrm{tags}}$ distribution and has no effect on the estimated
signal acceptance. For the simulated signal events, all three
SECVTX vertex masses are varied, while for the back- 
grounds only the mass of the simulated third tag in the 
event is varied because the other two tag masses in each 
event come directly from the data. 

Varying the value of $\lambda$ used to subtract the non-$b\bar{b}$
component from the double-tagged events changes the
shapes of the resulting corrected background templates, 
and also the predicted normalizations of the bbQ and bQb 
templates. 

We assign 50% uncertainty to the 2% (b) and 4% (c) 
jet flavor fractions from pythia which are used to obtain 
the a priori normalization estimates of the background 
components. This variation is used only when throwing 
the simulated experiments used to estimate the sensitiv- 
ity. It is not used to constrain any of the templates in 
the fits. The results are largely insensitive to the size of 
this variation, so long as it is large relative to the pre- 
cisions obtained on each template from the fit but not 
so large that it causes the simulated experiments to of- 
ten have zero contribution from any of the background 
components. When performing the variation we assume 
that bbB and bbC are 100% correlated because they are 
likely to involve the same underlying physics processes; 
the same holds for bBb and bCb. No correlation is as- 
sumed between bbB and bBb or bbC and bCb. 
TABLE II. Summary of systematic uncertainties.

source                              variation       applies to            type
luminosity                          ±6%             signal                rate
Monte Carlo statistics              ±2%             signal                rate
selection efficiency                ±5% per jet     signal                rate
PDFs                                +3.5/-4.5%      signal                rate
jet energy scale                    ±4.5%           signal                rate/shape
b/c $m_{\mathrm{tag}}$              3%              signal/backgrounds    shape
mistag $m_{\mathrm{tag}}$           3%              backgrounds           shape
mistag asymmetry factor $\lambda$   1.4 ± 0.2       backgrounds           rate/shape
heavy flavor fractions              ±50%            backgrounds           rate



VII. RESONANCE SEARCH IN THE 
TRIPLE-TAGGED DATA

We perform fits of the data using the background templates and
templates for a neutral scalar in the mass range of 90-350 GeV/$c^2$.
These fits are identical to the one shown in Fig. 14 except that in
addition to floating the background normalizations we also release the
constraint on the template representing a possible resonant component
of the data. We use a modified frequentist $\mathrm{CL}_s$ method [34]
to compute the sensitivity and set 95% confidence level upper limits on
the cross section for production of a narrow scalar as a function of
mass. We compare the data to the best-fit background plus signal model
for the mass point with the most significant excess. Finally, we
interpret our results as limits on $\tan\beta$ in the MSSM as a
function of the pseudoscalar Higgs boson mass $m_A$, including the
effects of the Higgs boson width.



A. Cross section times branching ratio limits 

The limit calculations are performed using a custom program based on
the mclimit package [35]. It performs the fitting of the background and
signal templates using either the observed data or simulated
experiments, and calculates confidence levels using the
$\mathrm{CL}_s$ method. The test statistic employed is the difference
in $\chi^2$ between fits using only the background templates and fits
using both background and signal templates.



B. Simulated Experiments 

Simulated experiments are generated based on the background predictions
in Table I. The number of signal events generated depends on the
assumed $\sigma \times BR$, the integrated luminosity, and the
acceptance shown in Fig. 4. The predictions for the numbers of each
background type and for the signal are randomly varied for each
simulated experiment according to the systematic uncertainties shown in
Table II. The distributions of $m_{12}$ and $x_{\mathrm{tags}}$ are
also randomly varied using histogram interpolation. The resulting
background and signal templates are summed to obtain estimates for the
number of events in each bin of $m_{12}$ and $x_{\mathrm{tags}}$. These
are input to a Poisson random-number generator to produce integer bin
counts for the simulated experiment with the appropriate statistical
variations. The simulated experiments are then fit using the default
background and signal templates to build probability densities of the
test statistic for various values of $\sigma \times BR$. The fits of
either the observed data or simulated experiments always use the
unmodified templates. The systematic uncertainties are only applied
when building the simulated experiments.
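The generation of one simulated experiment can be sketched as follows, assuming NumPy. This illustrates only the rate variation and the Poisson step; the shape variation by histogram interpolation is omitted. The Gaussian form of the rate variation, the toy template shapes, and the per-component uncertainties are assumptions of the sketch, while the predicted counts are the a priori estimates of Table I.

```python
import numpy as np


def throw_pseudo_experiment(rng, templates, predicted, rel_unc):
    """Generate integer bin counts for one simulated experiment.

    templates: array (K, n_m12, n_x) of unit-area templates
    predicted: array (K,) of a priori event counts per component
    rel_unc:   array (K,) of fractional rate uncertainties
    """
    # Rate variation: Gaussian smearing of each normalization, floored at
    # zero. The Gaussian form is an assumption of this sketch.
    shifts = rng.standard_normal(len(predicted))
    varied = np.clip(predicted * (1.0 + rel_unc * shifts), 0.0, None)
    # Expected content of every (m12, x_tags) bin for this experiment.
    expected = np.tensordot(varied, templates, axes=1)
    # Statistical variation: independent Poisson draw in each bin.
    return rng.poisson(expected)


rng = np.random.default_rng(12345)
# Toy unit-area templates (random shapes, illustration only).
templates = rng.random((5, 4, 9))
templates /= templates.sum(axis=(1, 2), keepdims=True)
# A priori estimates from Table I: bbB, bBb, bbX, bCb, bQb.
predicted = np.array([1300.0, 2950.0, 1990.0, 1380.0, 3480.0])
rel_unc = np.array([0.5, 0.5, 0.3, 0.5, 0.1])  # hypothetical values
counts = throw_pseudo_experiment(rng, templates, predicted, rel_unc)
print(counts.sum())
```

Each such set of counts would then be fit with the unmodified templates, and the resulting test statistic added to the ensemble.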



C. Limit Results 

The median expected limits on $\sigma \times BR$ for statistical errors
only and with full systematic uncertainties applied are shown in
Table III along with the observed limits. The systematic uncertainties
increase the limits by 15-25% relative to the no-systematics case.

The expected and observed limits for the full systematics case are
plotted as a function of the narrow scalar mass in Fig. 15. Also shown
are the bands resulting from calculating the expected limits using the
$\pm 1\sigma$ and $\pm 2\sigma$ values of the test statistic from
simulated experiments containing no signal. We observe a positive
deviation of greater than $2\sigma$ from the expectation in the mass
region of 130-160 GeV/$c^2$. The most significant discrepancy is at
$m_\phi$ = 150 GeV/$c^2$, with a $1 - \mathrm{CL}_b$ $p$-value of
0.23%. Including the trials factor to account for the number of mass
points searched, we expect to see a deviation of this magnitude at any
mass in the range which we test (90-350 GeV/$c^2$ in steps of
10 GeV/$c^2$) in 2.5% of background-only pseudoexperiments.
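For orientation, the two quoted probabilities can be expressed as equivalent Gaussian significances. The sketch below uses the one-sided convention, which is an assumption on our part; the text itself quotes only the $p$-values.

```python
from statistics import NormalDist


def p_to_sigma(p_value):
    """One-sided Gaussian significance corresponding to a p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


local_sigma = p_to_sigma(0.0023)   # 0.23% at the most significant mass point
global_sigma = p_to_sigma(0.025)   # 2.5% after the trials factor
print(round(local_sigma, 2), round(global_sigma, 2))
```

Under this convention the local deviation is somewhat below three standard deviations and the global one is close to two.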

The results of the fit of the observed data for a narrow scalar mass of
150 GeV/$c^2$ are shown in Fig. 16 and Table IV. In this case the
$\chi^2$/dof is 171.2/162 = 1.057, with the fit assigning 420±130
events to the signal template. If interpreted as narrow scalar
production this corresponds
TABLE III. Median expected and observed limits on
$\sigma(p\bar{p} \to b\phi) \times B(\phi \to b\bar{b})$, in pb.

$m_\phi$ (GeV/$c^2$)   no systematics   full systematics   observed
 90                    39.8             48.8               26.4
100                    41.3             50.8               32.6
110                    22.7             28.0               27.8
120                    20.1             23.0               34.5
130                    13.4             15.5               28.8
140                    12.0             13.8               33.8
150                     9.2             10.7               28.0
160                     8.1              9.1               22.2
170                     6.3              7.3               16.7
180                     6.0              6.7               11.6
190                     5.2              6.1                7.7
200                     4.9              5.5                6.4
210                     4.3              4.9                5.1
220                     4.1              4.6                5.0
230                     3.6              4.2                4.8
240                     3.5              4.1                5.1
250                     3.2              3.9                4.9
260                     3.1              3.7                4.9
270                     9 Q              o.o                A 7
280                     2.9              3.4                4.5
290                     2.7              3.3                4.4
300                     2.7              3.2                4.3
310                     2.5              3.3                4.9
320                     2.5              3.1                4.7
330                     2.7              3.1                4.8
340                     2.5              3.0                4.8
350                     2.5              3.3                5.6


TABLE IV. Numbers of fitted events for each background type and for a
narrow scalar signal with $m_\phi$ = 150 GeV/$c^2$.

component    $N_{\mathrm{fit}}$
bbB          2280±600
bBb          1490±670
bbX          2150±160
bCb          2050±630
bQb          3100±400
Higgs        420±130



to a cross section times branching ratio of about 15 pb 
within our Higgs-like production model. 



D. Checks of the Background Model 

Several checks are made to investigate whether the slight excess in the
140-170 GeV/$c^2$ mass region might be due to a neglected background
contribution or mismodeling of one or more of the background templates.

One possible explanation is the effect of neglecting the 
component of the multijet background with fewer than 



FIG. 15. Median, $1\sigma$, and $2\sigma$ expected limits, and observed
limits on narrow resonance production versus $m_\phi$ on linear (a) and
logarithmic (b) scales.
two $b$-jets. The components with at least two charm jets are found to
be accommodated by residual $c\bar{c}$ contributions in the
double-tagged sample used to construct the background estimates. To
check the effect of backgrounds with at least two falsely-tagged
light-flavor jets, we introduce a template into the fit derived from
events with one positive and two negative tags. We find the fit prefers
to assign $\lesssim$1% of the sample to this template, with slightly
reduced fit quality as determined from the $\chi^2$/dof
(170.5/161 = 1.059). The change in the fitted excess is positive and
less than 5%.

We return to the question of $t\bar{t}$ pair production and
$Z \to b\bar{b}$ + jets backgrounds, which are neglected in the fit. We
expect around 30 and 100 events from these sources, respectively. The
$Z \to b\bar{b}$ + jets background would not need to be explicitly
included in the fit even if it were much larger, because it is already
represented in the double-tagged events used to construct the
background templates. The jets which accompany $Z \to b\bar{b}$ are
similar to the jets in multijet $b\bar{b}$ + jets events, so the
fraction of $Z \to b\bar{b}$ + jets in the double-tagged background
sample translates into the correct fraction to account for the
$Z \to b\bar{b}$ contribution to the triple-tagged signal sample. The
$t\bar{t}$ contribution is also partially accounted for by this
mechanism, although the jet flavor
FIG. 16. Fit of the triple-tagged data sample using the QCD background
templates and the signal template for $m_\phi$ = 150 GeV/$c^2$, in the
$m_{12}$ (a) and $x_{\mathrm{tags}}$ (b) projections. The differences
between the data and the fit model are shown in the lower section of
each figure.
composition is not as similar due to enhanced charm production from
$W \to c\bar{s}$ decays. Because the overall $t\bar{t}$ contribution is
small, the remaining contribution can safely be neglected.

Mismodeling of the instrumental bias introduced by simulating the
effect of the third tag could distort the background templates and
produce an apparent excess. We test our sensitivity to this effect by
replacing the $E_T$-dependence of the SECVTX tag efficiency for
$b$-jets which we measured in the data with that predicted by our full
detector simulation. This change is much larger than the precision with
which we measure the $E_T$-dependence in the data. Fitting the data
with these modified templates, we find changes to the normalizations of
the individual background templates of 50-100 events and virtually no
change in the summed best-fit background model. We perform a similar
test on the $E_T$-dependence of the false tag rate used to construct
the bbQ and bQb templates, in this case replacing the $E_T$-dependence
from the full detector simulation with an estimate derived from
negative tags in the data. Fitting with these modified templates we
find changes in the background normalizations consistent with
statistical fluctuations, and again little change in the total
background model.



E. MSSM interpretation 

To interpret the data in MSSM scenarios, we must know the production
cross section for Higgs boson events with a given pseudoscalar mass
$m_A$ as a function of $\tan\beta$. At tree level this can be computed
[27] as

$$\sigma_{\mathrm{MSSM}} = 2 \times \sigma_{\mathrm{SM}} \times \tan^2\beta \times 0.9 \qquad (6)$$

where $\sigma_{\mathrm{SM}}$ is the standard model cross section for a
Higgs boson of mass $m_A$, the factor of two reflects the degeneracy
between $A$ and $h/H$, and 0.9 is the branching ratio
$B(A \to b\bar{b})$.

In order to go beyond tree level, we must consider the effects of loop
corrections which can enhance the cross section by more or less than
$\tan^2\beta$ depending upon the MSSM scenario. We must also include
the effects of the Higgs boson width which can become significant when
the down-type couplings are enhanced by such large factors. This means
that not only the amount of signal expected but also the properties of
that signal such as the reconstructed $m_{12}$ spectrum will change
depending upon the value of $\tan\beta$ in the scenario under
consideration.

In Refs. [26l [27] an approximate expression for the 
cross section times branching ratio for Higgs boson pro- 
duction in the MSSM, including loop effects, is given as: 



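$$\sigma(b\bar{b}\phi) \times B(A \to b\bar{b}) \simeq 2\,\sigma(b\bar{b}\phi)_{\rm SM} \times \frac{\tan^2\beta}{(1+\Delta_b)^2} \times \frac{9}{(1+\Delta_b)^2 + 9} \qquad (7)$$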

where $\phi$ is a Higgs boson (either the SM variety or one 
of $h/H/A$), $\sigma(b\bar{b}\phi)_{\rm SM}$ is the SM cross 
section, the factor of two comes from the degeneracy of $A$ 
with either $h$ or $H$, and the loop effects are incorporated 
into the $\Delta_b$ parameter. For our purposes it is 
important only to note that $\Delta_b$ is proportional to the 
product of $\tan\beta$ and the Higgsino mass parameter $\mu$. 
Sample values of $\Delta_b$ given in Ref. [27] are $-0.21$ for 
the $m_h^{\max}$ scenario and $-0.1$ for the no-mixing 
scenario (at $\mu = -200$ GeV and $\tan\beta = 50$). 
It is apparent that negative values of $\mu$, and hence of 
$\Delta_b$, will increase the MSSM Higgs boson yield at a 
given $\tan\beta$ above the tree-level values and result in 
stronger limits on $\tan\beta$, while scenarios with positive 
$\mu$ will produce the opposite effect. Using Eq. (7) we can 
predict the Higgs boson yield for any value of $\tan\beta$ 
and $\Delta_b$, and therefore derive limits in any desired 
scenario. 
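
As an illustration of how Eq. (7) can be evaluated, the short Python sketch below computes the factor that multiplies the SM cross section. The function names are ours and purely illustrative. The linear rescaling of $\Delta_b$ with $\tan\beta$ from the quoted reference point ($\tan\beta = 50$) is an assumption based only on the stated proportionality at fixed $\mu$; it is not the full calculation of Refs. [26, 27].

```python
def mssm_enhancement(tan_beta, delta_b=0.0):
    """Factor multiplying sigma(bb phi)_SM on the right-hand side of Eq. (7)."""
    loop = (1.0 + delta_b) ** 2
    return 2.0 * tan_beta ** 2 / loop * 9.0 / (loop + 9.0)


def scaled_delta_b(tan_beta, delta_b_ref, tan_beta_ref=50.0):
    """Assumption: Delta_b scales linearly with tan(beta) at fixed mu."""
    return delta_b_ref * tan_beta / tan_beta_ref


# Delta_b = 0 reproduces the tree-level Eq. (6): 2 * tan^2(beta) * 0.9
tree = mssm_enhancement(50.0)
# Sample value quoted in the text at mu = -200 GeV, tan(beta) = 50
mhmax = mssm_enhancement(50.0, scaled_delta_b(50.0, -0.21))
print(tree, mhmax / tree)
```

With $\Delta_b = 0$ the factor reduces to $2\tan^2\beta \times 0.9$, as in Eq. (6). For $\Delta_b = -0.21$ at $\tan\beta = 50$ the yield is roughly 1.7 times the tree-level value, and a positive $\Delta_b$ reduces it.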

The limits shown in Fig. 15 apply only to narrow 
scalars such as the standard model Higgs boson. If the 
cross section is increased by scaling the $b\bar{b}\phi$ 
coupling, as happens in the MSSM, then the width of the 
Higgs boson will increase as well. In order to account for 
this we convolve the cross section shown in Fig. 3 with a 
relativistic Breit-Wigner to produce cross section lineshapes 
for various values of the Higgs boson pole mass, $\tan\beta$, 
and $\Delta_b$. Parametrizations of the partial widths 
$\Gamma_{b\bar{b}}$ and $\Gamma_{\tau\tau}$ as functions of 
$m_A$ and $\tan\beta$ are obtained from the FEYNHIGGS [36] 
program, with $\Gamma_{b\bar{b}}$ also dependent on $\Delta_b$. 

Changing the width of the Higgs boson also changes 
the total cross section as a function of the pole mass. 
We integrate the broadened cross section described above 
for $m_\phi > 50$ GeV/$c^2$ (where the acceptance for a narrow 
Higgs boson drops to zero) and divide by the cross section 
value expected for a narrow Higgs boson to derive a correction 
factor. For $\tan\beta$ between 40 and 120, this factor ranges 
from 1.0 to 0.8 at a pole mass of 90 GeV/$c^2$ and from 1.0 to 
1.1 at 180 GeV/$c^2$. The factor drops below 1 for low pole 
masses because part of the broadened cross section falls below 
the cutoff at 50 GeV/$c^2$. This information is needed when 
computing the expected number of events for a given Higgs 
boson mass and $\tan\beta$ value in the limits calculator. 
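
The correction factor described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the procedure used in the analysis: the falling exponential `toy_cross_section` is a hypothetical stand-in for the cross section of Fig. 3, the widths passed in are arbitrary rather than FEYNHIGGS values, and the integration range and the trapezoid rule are our choices.

```python
import math


def breit_wigner(m, pole, width):
    """Unnormalized relativistic Breit-Wigner as a function of the mass m."""
    return 1.0 / ((m * m - pole * pole) ** 2 + (pole * width) ** 2)


def toy_cross_section(m, slope=30.0):
    """Hypothetical falling narrow-width cross section (stand-in for Fig. 3)."""
    return math.exp(-m / slope)


def trapezoid(f, lo, hi, n=20000):
    """Plain trapezoid-rule integral of f over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


def width_correction(pole, width, xsec=toy_cross_section,
                     cutoff=50.0, m_max=1000.0):
    """Broadened cross section above the cutoff over the narrow-width value."""
    norm = trapezoid(lambda m: breit_wigner(m, pole, width), 0.0, m_max)
    broadened = trapezoid(
        lambda m: xsec(m) * breit_wigner(m, pole, width), cutoff, m_max
    ) / norm
    return broadened / xsec(pole)


# Arbitrary illustrative widths in GeV, not FEYNHIGGS predictions
print(width_correction(90.0, 10.0), width_correction(180.0, 10.0))
```

The factor results from two competing effects: the part of the lineshape below the cutoff is lost, while a steeply falling cross section gives extra weight to the low-mass tail that remains above it.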

Fit templates for the Higgs boson signal as a function 
of $\tan\beta$ are constructed by combining the narrow-width 
templates, weighted by the lineshapes and by the acceptance 
parametrization shown in Fig. 4. We scan over $\tan\beta$ in 
steps of 5, calculate CL$_s$ at each point, and exclude 
regions with CL$_s < 0.05$. The limits obtained are shown in 
Fig. 17 for $\Delta_b = 0$. The sensitivity begins to degrade 
rapidly for Higgs boson masses above 180 GeV/$c^2$, where the 
values of $\tan\beta$ required to produce an observable cross 
section result in an $m_{12}$ spectrum that no longer displays 
a mass peak due to the large width of the Higgs boson. 
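
The logic of the $\tan\beta$ scan can be illustrated with a much simpler stand-in for the fit. The Python sketch below replaces the two-dimensional template fit with a single-bin Poisson counting experiment and assumes that the signal yield scales as $\tan^2\beta$, ignoring the width and loop effects discussed above. All event counts are hypothetical, and a point is treated as excluded at 95% C.L. when CL$_s$ falls below 0.05.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n_obs + 1):
        term *= mean / k
        total += term
    return total


def cls_counting(n_obs, background, signal):
    """CL_s = CL_{s+b} / CL_b for a single-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def scan_tan_beta(n_obs, background, signal_at_ref,
                  tan_beta_ref=50.0, step=5, max_tan_beta=300):
    """Return the first scanned tan(beta) with CL_s < 0.05, or None."""
    for tan_beta in range(step, max_tan_beta + 1, step):
        # Assumption: signal yield scales as tan^2(beta), as in Eq. (6)
        signal = signal_at_ref * (tan_beta / tan_beta_ref) ** 2
        if cls_counting(n_obs, background, signal) < 0.05:
            return tan_beta
    return None


# Hypothetical inputs: 10 observed, 10 expected background,
# 5 signal events expected at tan(beta) = 50
print(scan_tan_beta(10, 10.0, 5.0))
```

In the analysis itself the signal templates change shape with $\tan\beta$, so the expected yield and the fit templates must both be recomputed at every scan point.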

Along with the $\Delta_b = 0$ case, limits are also generated 
for the $m_h^{\max}$ scenario with $\mu = -200$ GeV and are 
shown in Fig. 18. Because of the relatively large and negative 
values of $\Delta_b$ in this scenario, we expect many more 
signal events for a given $\tan\beta$ than in the 
$\Delta_b = 0$ case, and the $\tan\beta$ limits are 
correspondingly much stronger. In both cases the observed 
limits in the mass range 120-170 GeV/$c^2$ are slightly above 
the $2\sigma$ band, due to the excess of data over the 
background model in this region. 



FIG. 17. Median, $1\sigma$, and $2\sigma$ expected limits, and 
the observed limits versus $m_A$, including the Higgs boson 
width and for $\Delta_b = 0$. 

FIG. 18. Median, $1\sigma$, and $2\sigma$ expected limits, and 
the observed limits versus $m_A$, including the Higgs boson 
width, for the $m_h^{\max}$ scenario with $\mu = -200$ GeV. 

VIII. CONCLUSION 

A search for resonances produced in association with 
$b$ quarks is performed in triple-$b$-tagged three- or 
four-jet events, using 2.6 fb$^{-1}$ of $p\bar{p}$ collisions 
from the Tevatron. This process could be present at a 
measurable rate in supersymmetric models with high values of 
$\tan\beta$. We use the mass of the two leading jets and jet 
flavor information from the secondary vertex tags to fit for 
a Higgs boson component within the heavy flavor multijet 
background. 

We find the data are consistent with the background 
model predictions over the entire mass range investigated. 
The largest deviation is observed in the mass region 
140-170 GeV/$c^2$, where the data show an excess over 
background with a p-value of 0.23% ($2.8\sigma$) at 
150 GeV/$c^2$. If this excess were to be attributed to the 
production of a narrow resonance in association with a 
$b$ jet with kinematics characteristic of Higgs boson 
production, it would correspond to a production cross section 
times branching ratio of about 15 pb. We estimate the 
probability of observing such a deviation at any mass in the 
range 90-350 GeV/$c^2$ to be 2.5% ($1.9\sigma$). Below 
140 GeV/$c^2$ and above 170 GeV/$c^2$ the limits are within 
$2\sigma$ of expectations. 
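
The correspondence between the quoted probabilities and Gaussian significances can be checked with a few lines of Python. The sketch assumes the one-sided convention, which is the one that reproduces the numbers quoted above; the function name is ours.

```python
from statistics import NormalDist


def significance(p_value):
    """One-sided Gaussian significance (in sigma) for a given p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


local = significance(0.0023)       # excess at 150 GeV/c^2
global_value = significance(0.025)  # any mass in 90-350 GeV/c^2
print(f"{local:.2f} {global_value:.2f}")
```

These evaluate to about 2.8 and 1.96 standard deviations, respectively, consistent with the values quoted in the text.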

The D0 experiment published results of a search similar 
to the one performed here in Ref. [I]. That 
analysis uses a multivariate selection and discrimination 
procedure tuned to the MSSM Higgs boson hypothesis, 
whereas here a more general resonance search is performed. 

The data are used to examine two MSSM scenarios. 
In the case where loop effects are small, we find that the 
growth of the Higgs boson width as the couplings are 
enhanced permits only weak limits of $\tan\beta > 250$ in the 
mass region around 150 GeV/$c^2$. In the $m_h^{\max}$ scenario 
with $\mu$ negative, the enhanced production through loop 
effects allows exclusion of $\tan\beta$ values greater than 40 
for $m_A = 90$ GeV/$c^2$ and about 90-140 for the mass 
range 110-170 GeV/$c^2$. The results in Ref. [4] exclude 
values of $\tan\beta$ above 50-60 over this mass range in the 
same $m_h^{\max}$ scenario with $\mu$ negative. 

The MSSM study allows comparison with the results in 
the $A \to \tau\tau$ channel [ME], which are much less 
sensitive to the details of the MSSM scenario. The 
$\tau\tau$ analyses exclude values of $\tan\beta$ above 25-35 
in the mass range from 90-200 GeV/$c^2$. Any interpretation of 
the observed excess in the results presented here in terms of 
MSSM Higgs boson production would therefore be restricted to 
scenarios with large negative values of the Higgsino mass 
parameter $\mu$, where the event yield in the $b\bar{b}$ decay 
mode for a given value of $\tan\beta$ is enhanced. 